Technique for freeing renamed registers

ABSTRACT

Register renaming circuitry is provided for a processing apparatus configured to process a stream of instructions from an instruction set specifying registers from an architectural set of registers. The apparatus includes a physical set of registers configured to store data values being processed by the processing apparatus. The register renaming circuitry is configured to receive a stream of operations from an instruction decoder and to map registers that are to be written to by the stream of operations to physical registers within the physical set of registers that are currently available. The register renaming circuitry is also configured to identify additional registers, namely registers to be written to by the operations that are not from the architectural set of registers. The register renaming circuitry comprises register release circuitry configured to release the physical registers that have been mapped to the architectural registers when a first set of conditions has been met, and to release the physical registers that have been mapped to the additional registers when a second set of conditions has been met.

BACKGROUND

The field of the invention relates to data processing and in particular to register renaming in a processing apparatus.

It is known to provide data processing apparatus which process instructions from an instruction set that specifies registers using an architectural set of registers, while the apparatus itself uses a physical set of registers that is larger than the architectural set. This is a technique that has been developed to try to avoid resource conflicts due to instructions executing out of order in the processor. In order to have compact instruction encodings most processor instruction sets have a small set of register locations that can be directly named. These are often referred to as the architecture registers and in many ARM® (registered trade mark of ARM Ltd Cambridge UK) RISC instruction sets there will be 32 architecture registers.

When instructions are processed different instructions take different amounts of time to complete. In order to speed up execution times, processors may have multiple execution units, and may perform out of order execution. This can cause problems if the data used by these instructions is stored in a very limited register set as a value stored in one register may be overwritten before it is used by another instruction. This leads to errors. In order to address this problem it is known for some processing cores to perform processing using more registers than are specified in the instruction set. Thus, for example, a core may have 56 physical registers to process an instruction set having 32 architecture registers. This enables a core to store values in more registers than is specified by the instruction set and can enable a value needed by an instruction that takes a long time to be executed to be stored in a register not used by other neighbouring instructions. In order to be able to do this the core needs to “rename” the registers referred to in the instruction set so that they refer to the physical registers in the core. In other words an architectural register referred to in the instruction is mapped onto a physical register that is actually present on the core.

Renaming of the registers is generally done using a renaming table which maps registers from the architecture set of registers to registers in the physical set. The renaming occurs early in the processing pipeline, generally shortly after decode, and it is important that the mapping is kept until the instruction has completed and any other instruction that needs to read from the register written to has also completed. However, at a certain point the physical register that was mapped to the architectural register will need to be released so that it can be used in another mapping, otherwise the processor will run out of physical registers to map to. Generally a set of conditions is applied that, when met, indicates that a mapping is no longer required and the physical register can be released. The conditions include that the register write has been performed and that the mapping is no longer in the renaming table. There are further conditions that are required in processors where exceptions may occur, to ensure that the processor can be restarted following an exception. Thus, the mappings that were current at a point where speculative execution starts need to be stored in a restore table and the physical registers present in this table should not be remapped until it is known that the speculatively executed instructions will complete. An exception may occur where instructions are executed speculatively and it is determined that the prediction that triggered the execution was not correct. Non-exception instructions are instructions that execute in a statically determined way such that it is known that they will complete.
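
By way of illustration only, the renaming table and the pool of available physical registers described above might be modelled as follows. This is a minimal sketch, not the circuitry itself: the class name `RenameTable` and its method names are hypothetical, the initial identity mapping is an assumption, and the sizes of 32 and 56 are simply the example figures given above.

```python
class RenameTable:
    """Toy model mapping architectural register numbers to physical ones."""

    def __init__(self, num_arch=32, num_phys=56):
        # Assumption: architectural register i starts out in physical register i.
        self.table = {a: a for a in range(num_arch)}
        # The remaining physical registers are available for renaming.
        self.free = list(range(num_arch, num_phys))

    def rename_dest(self, arch_reg):
        """Map a destination register to a currently available physical register.

        Returns (new_phys, old_phys). The old physical register leaves the
        table but is NOT freed here, because earlier operations may still
        need to read the value it holds.
        """
        if not self.free:
            raise RuntimeError("no physical register available for renaming")
        new_phys = self.free.pop(0)
        old_phys = self.table[arch_reg]
        self.table[arch_reg] = new_phys
        return new_phys, old_phys

    def lookup_src(self, arch_reg):
        """Source operands read whichever physical register is currently mapped."""
        return self.table[arch_reg]

    def release(self, phys_reg):
        """Called only once the release conditions for phys_reg have been met."""
        self.free.append(phys_reg)
```

In this sketch the decision of when `release` may be called is deliberately left outside the table, since that decision is the subject of the conditions discussed in the remainder of this document.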

Further problems may arise for source registers of store instructions, which may have a very long latency if the store misses at the address translation level, where a virtual address is translated to a physical address. The instruction may then remain stalled for a long time, during which the physical register used to hold the value that is to be written to memory must not be overwritten by another instruction; thus, it must not be available for renaming. This is addressed using a “snapshot”, where a record of source registers for pending stores is kept and the state of the processor core is monitored. The register renaming circuitry avoids renaming these registers until it is determined that the store has completed.

Thus, the conditions for freeing renamed registers are in some cases complex and can lead to registers being unavailable for a significant amount of time. It would be desirable to be able to identify situations where registers do not need to meet all of these conditions such that they can be freed more quickly and easily.

SUMMARY

A first aspect of the present technique provides register renaming circuitry for a processing apparatus, said processing apparatus being configured to process a stream of instructions from an instruction set, said instructions specifying registers from an architectural set of registers, said processing apparatus comprising a physical set of registers configured to store data values being processed by said processing apparatus;

said register renaming circuitry being configured to receive a stream of operations from an instruction decoder within said processing apparatus and to map registers that are to be written to by said stream of operations to physical registers within said physical set of registers that are currently available;

said register renaming circuitry being configured to identify additional registers that are registers that are to be written to by said operations that are not from said architectural set of registers;

said register renaming circuitry comprising register release circuitry for releasing physical registers that have been mapped such that they are available for register renaming;

said register release circuitry being configured to release said physical registers that have been mapped to said registers from said set of architectural registers when a first set of conditions have been met, and to release said physical registers that have been mapped to said additional registers when a second set of conditions have been met, at least some of said conditions within said first set of conditions being different to said conditions within said second set of conditions.

The present technique recognises that where registers are being mapped that are not part of the architectural set of registers, they are not visible to the programmer, they may have different properties and their use may follow different rules. In such a case the conditions that generally need to be applied to release mapped architectural registers, the so-called first set of conditions, may not apply to these registers. The recognition of this allows a different or second set of conditions to be followed, which in many cases are simpler and allow a quicker way of determining when the registers may be freed.

In this regard the second set of conditions are generally less restrictive than the first set of conditions so that the registers may be freed more quickly and with generally a lower requirement for monitoring the state of the processing apparatus.

In some embodiments, said stream of operations comprise at least some micro-operations, wherein said micro-operations are generated by said instruction decoder by splitting at least one of said instructions from said stream of instructions into a plurality of said micro-operations, at least one of said micro-operations specifying at least one of said additional registers.

One case that may arise where registers additional to those within the architectural set are specified by an operation is where the decoder splits an instruction into a set of micro-operations. These micro-operations may need to transmit data between themselves and to do this they use what are often called temporary or additional registers. One of the micro-operations will therefore specify a register it is to write to and another will read from this register. In such a case, as these registers are not in the architectural set, other instructions will not access these registers and thus, when the micro-operations that the instruction was formed into have completed the value stored in the register will no longer be required and the physical register used to store this value can be freed. Thus, provided one can determine when the micro-operations have completed, some of the more general constraints required to free a register may be ignored in this case.

In some embodiments, said register renaming circuitry comprises at least one counter associated with said at least one additional register, said register renaming circuitry is configured in response to receiving a micro-operation that writes to said at least one additional register to commence counting, said second set of conditions including said counter associated with said additional register having counted to a predetermined value.

One simple and area efficient way of determining one of the second set of conditions that needs to be met to free a register is by the use of a counter. As noted previously, where an instruction is divided into micro-operations it is only the micro-operations that came from the original instruction that will require the use of particular register(s). Therefore, if one knows the number of micro-operations that the instruction was split into, when that number of micro-operations has passed through the renaming circuitry one can deduce that the register(s) used by the micro-operations are no longer required and that they can therefore be freed as far as this portion of the circuit is concerned. Thus, a counter associated with the register renaming circuitry and the additional register(s) can be used to determine one of the second set of conditions for determining when a physical register can be freed.

In some embodiments, said counter counts a number of clock cycles while in other embodiments said counter counts a number of operations received at said register renaming circuitry.

The number of micro-operations that pass through the renaming circuitry determines when the split instruction has gone through the renaming circuitry. However, it is also recognised that in normal operation the register renaming circuitry receives an operation for every clock cycle. Thus, rather than counting operations one can count clock cycles. An advantage of this is that clock cycles may be simpler to count than operations. Also, where the register renaming circuitry does not receive an operation on each clock cycle, an exception will have occurred which has resulted in some stalling of the processing circuitry. In such a case, it is likely that the following micro-operations will not be the ones from the instruction generating the earlier micro-operations and thus, the counting may as well continue and this condition be met, as this instruction will not complete and the register renaming issues will be handled by exception circuitry.

In some embodiments, said predetermined value comprises a maximum number for all of said instructions that are split into micro-operations by said instruction decoder, of micro-operations between a micro-operation that is writing to said additional register and a micro-operation that reads said additional register.

Although the counters could be set to the number of micro-operations that each instruction is split into, it has been recognised that where the counter is set to count to a higher value than is required, it will simply continue counting when subsequent operations are received and will therefore reach the required value in time. As other conditions in the second set of conditions will generally take longer to be fulfilled than the counter value reaching the desired value, this is generally not a problem. Thus, the value chosen may be the maximum number, for all of the instructions that are split into micro-operations, of micro-operations between the micro-operation that is writing to the additional register and the final micro-operation that reads from it. This is the maximum value that is ever going to be needed. It should be noted that the counter needs to count to this value. This can be done either by setting it to 0 and recognising when it reaches this value, or by setting it to that value, counting down and recognising when it reaches 0, or some other means of counting can be used.
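
The counter described above can be sketched in software, purely as an illustration. The sketch below uses the count-down variant; the class name `TempRegCounter` and the parameter name `max_gap` (standing for the predetermined value) are hypothetical, and `tick` may be driven either once per operation received or once per clock cycle, corresponding to the two embodiments mentioned.

```python
class TempRegCounter:
    """Counts operations (or clock cycles) since an additional register was written."""

    def __init__(self, max_gap):
        # max_gap: hypothetical name for the predetermined value, i.e. the
        # largest writer-to-last-reader distance over all split instructions.
        self.max_gap = max_gap
        self.remaining = 0
        self.active = False

    def start(self):
        """A micro-operation writing the additional register has been received."""
        self.remaining = self.max_gap
        self.active = True

    def tick(self):
        """Call once per operation received, or once per clock cycle."""
        if self.active and self.remaining > 0:
            self.remaining -= 1

    def done(self):
        """True once the counter has counted to the predetermined value."""
        return self.active and self.remaining == 0
```

Because the counter saturates at zero, further operations arriving after the target has been reached leave the condition satisfied, which matches the observation that counting to a value larger than strictly necessary is harmless.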

In some embodiments, said second set of conditions further comprises that said mapped register has been written to, said second set of conditions being met when said counter associated with said additional register has counted to said predetermined value and said physical register that said additional register was mapped to has been written to.

An additional constraint that needs to be met is that the physical register that the additional register was mapped to has been written to. In this regard, the register renaming circuitry will be in the processing pipeline of the apparatus and it may take some time between the registers being renamed and the register being written. As the instruction has already entered the pipeline when it is renamed the operations will flow through to the writing stage in order and thus, once one knows from the counter value that the micro-operations have left the register renaming circuitry and once one knows that the register has been written, the conditions to free it have been met as one can be sure that the operations that are reading the register will be executed before any operations that might write to the register provided no exception occurs. If an exception occurs between the writing of the register and the reading of it then the fact that it is freed also does not matter as the operations that needed to read the register will not complete.

As can be appreciated the second set of conditions in this case are simply that the register has been written to and that the counter has reached a predetermined value. These are simple conditions to monitor and occur quite quickly, allowing the registers to be freed in a simple and efficient manner.
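
Expressed as a predicate, the second set of conditions in this embodiment reduces to a conjunction of two flags. The following is only an illustrative sketch; the names `TempMapping`, `valid`, `counter_done` and `can_release_additional` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TempMapping:
    """State kept for one physical register mapped to an additional register."""
    phys_reg: int
    valid: bool = False         # set once the physical register has been written
    counter_done: bool = False  # set once the counter reaches the predetermined value


def can_release_additional(mapping: TempMapping) -> bool:
    """Second set of conditions: register written AND counter has reached its value."""
    return mapping.valid and mapping.counter_done
```

Neither flag requires consulting the renaming buffer, a restore table or a record of pending stores, which is the sense in which these conditions are simpler to monitor.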

In some embodiments, said physical registers comprise a valid indicator associated with them, said register renaming circuitry being configured to set said valid indicator to invalid on mapping a physical register, said valid indicator being set to valid when said register is written.

One way the register renaming circuitry can determine when the physical register has been written to is by the use of a valid indicator associated with the physical register bank. In such a case the register renaming circuitry will set this indicator to invalid when it remaps the register and on the register being written the data processing apparatus will set it to valid. The register renaming circuitry will recognise that the register has been written when it detects the valid signal.
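
The life cycle of the valid indicator can be illustrated with a small model. This is a sketch under stated assumptions: the class `PhysicalRegisterBank` and its method names are hypothetical, and registers are assumed to start out valid.

```python
class PhysicalRegisterBank:
    """Toy register bank with one valid indicator per physical register."""

    def __init__(self, size):
        self.values = [0] * size
        # Assumption: every register starts out valid.
        self.valid = [True] * size

    def on_map(self, phys_reg):
        """Renaming circuitry: the value is now pending, so mark it invalid."""
        self.valid[phys_reg] = False

    def write(self, phys_reg, value):
        """Processing circuitry: writing the register sets the indicator to valid."""
        self.values[phys_reg] = value
        self.valid[phys_reg] = True

    def has_been_written(self, phys_reg):
        """What the renaming circuitry observes when checking the release conditions."""
        return self.valid[phys_reg]
```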

In some embodiments, said register renaming circuitry is further configured to determine when an operation is received that writes to one of said additional registers that is currently mapped and stored in a renaming buffer, said register renaming circuitry being configured to generate a signal to indicate to said register release circuitry that said counter associated with said one of said additional registers has reached said predetermined value irrespective of a value of said counter and to reset said counter.

In some cases, the register renaming circuitry will receive a write to one of the additional registers when it is currently mapped. In such a case, the register renaming circuitry will recognise that all of the operations that might have read that mapped register will have completed, as operations to the renaming circuitry are received in order. Thus, it will reset the counter to a starting value as it will need to count the next set of micro-operations but it will send a signal indicating that the counter has reached the required value for the previously mapped physical register even if it has not reached this value. This is because it knows that the micro-operations have completed and the counter is counting to the value of the maximum number of micro-operations of all split instructions. Thus, if the instruction is split into fewer micro-operations than the maximum value set for the counter, a new operation may be received from a different instruction prior to the counter reaching the required value and this can be used as the signal to reset the counter and indicate that the condition of the counter reaching the required value has been met.
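
The early-completion behaviour described above might be sketched as follows, for a single additional register. This is illustrative only; the class `TempRegTracker`, the list `counter_done_for` (standing in for the signal sent to the register release circuitry) and the method names are all hypothetical.

```python
class TempRegTracker:
    """Tracks the counter condition for one additional register."""

    def __init__(self, max_gap):
        self.max_gap = max_gap      # predetermined value (hypothetical name)
        self.count = 0
        self.current_phys = None    # physical register currently mapped, if any
        self.counter_done_for = []  # registers whose counter condition is signalled

    def on_write_op(self, new_phys):
        """An operation writing the additional register arrives at rename."""
        if self.current_phys is not None:
            # Operations arrive in order, so every reader of the old mapping has
            # already been received: signal "counter reached" irrespective of
            # the actual count.
            self.counter_done_for.append(self.current_phys)
        self.current_phys = new_phys
        self.count = 0              # reset to count for the new mapping

    def on_other_op(self):
        """Any other operation arrives at rename."""
        if self.current_phys is None:
            return
        self.count += 1
        if self.count >= self.max_gap:
            self.counter_done_for.append(self.current_phys)
            self.current_phys = None
```

With `max_gap` set to 4, a second write arriving after only one intervening operation still signals the condition for the first physical register, while the second register is signalled in the ordinary way after four further operations.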

In some embodiments, said register renaming circuitry further comprises a renaming buffer for storing a plurality of mappings of said registers specified by said operations received from said instruction decoder at said register renaming circuitry to said physical registers, said register renaming circuitry being configured to add a new mapping to said renaming buffer on receipt of an operation specifying a register and on receipt of an operation specifying a register that is already mapped in said renaming buffer to remove said mapping from said renaming buffer and to update said renaming buffer with a new mapping;

and said first set of conditions comprises that:

said physical register has been written to;

said physical register is not present in said renaming buffer; and

said physical register is not stored in said data processing apparatus as a restore register for restoring a set of mappings that was valid at a point prior to execution of speculative instructions.

As noted previously the first set of conditions are generally more restrictive than the second set of conditions as the registers from the architectural set can be referred to by different instructions within the instruction stream and thus, the moment they can be freed is constrained by various factors. Thus, as for the additional registers, the physical register should have been written to, but in addition to this they should not be present in the renaming buffer and should not be present as a restore register. Restore registers are registers that are present in restore mappings that the processing apparatus stores in order to be able to restore a state of the processing apparatus following an exception. Many processors execute instructions speculatively assuming that certain conditions will be met. If these conditions are not met then an exception is generated and the processor needs to be able to rewind back to a point before the instructions were speculatively executed. Where register renaming occurs the previous mappings at that point also need to be restored and the values in the physical registers mapped by these mappings must not have been overwritten. Thus, any register that may have its mapping restored should not be freed as the value stored in that register may yet be needed. The additional registers do not suffer from these constraints as they are not specified by other instructions but are only used by other micro-operations generated from the same instruction and thus, a point that the apparatus may wind back to is not a point where one of them is written and the value in it needed by subsequent instructions that will be executed.

In some embodiments, said first set of conditions further comprises that said physical register is not mapped to a source register of a decoded store instruction that has not completed.

A further constraint that the first set of conditions may have to follow concerns a source register of a decoded store instruction that has not completed. Store instructions store values from registers into memory. They can have considerable latency and it is important that the register is not overwritten before the value is stored. Thus, these registers need to be monitored when freeing registers for renaming.
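
For comparison with the second set, the first set of conditions, including the store-related constraint, can be written as a single predicate. This is a sketch only; the function name and the argument names (`written`, `rename_buffer`, `restore_regs`, `pending_store_srcs`) are hypothetical stand-ins for the state that the circuitry would monitor.

```python
def can_release_architectural(phys_reg, written, rename_buffer,
                              restore_regs, pending_store_srcs):
    """First set of conditions for a physical register mapped to an architectural register.

    written            -- set of physical registers that have been written
    rename_buffer      -- dict of current mappings: register -> physical register
    restore_regs       -- set of physical registers held in restore mappings
    pending_store_srcs -- set of physical registers that are sources of decoded
                          store instructions that have not completed
    """
    return (
        phys_reg in written
        and phys_reg not in rename_buffer.values()
        and phys_reg not in restore_regs
        and phys_reg not in pending_store_srcs
    )
```

All four terms must hold, so a register can remain unavailable for as long as the slowest of them takes to clear, for example while a store is stalled on address translation.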

A second aspect of the present technique provides a data processing apparatus comprising:

a register bank comprising a set of physical registers;

at least one instruction decoder for decoding a stream of instructions;

register renaming circuitry according to the first aspect of the present invention; and

at least one processor for processing said stream of instructions.

In some embodiments, said at least one processor comprises a data engine; and

said at least one instruction decoder is configured to split at least one predetermined instruction from said stream of instructions into a plurality of micro-operations and to send said plurality of micro-operations to be processed by said data engine.

The processing apparatus may comprise a data engine either on its own or in conjunction with another controlling processor. If a controlling processor is present, the controlling processor may have its own instruction decoder which recognises instructions that are to be processed by the data engine and transmits them to the instruction decoder of the data engine. The instruction decoder of the data engine will split at least some of these instructions into micro-operations that are to be processed by the data engine. These micro-operations may specify additional registers which when mapped are freed according to the second set of conditions.

In some embodiments, said register bank comprises a valid bit associated with each of said registers, said register renaming circuitry being configured when mapping a physical register to update said valid bit associated with said physical register to invalid and said data processing apparatus being configured to set said valid bit to valid on writing to said physical register.

As noted previously, one way of determining when a register is written is by the use of a valid bit which is set to invalid by the register renaming circuitry when mapping a physical register and is set to valid by the processing circuitry when writing to the physical register.

In some embodiments, said data processing apparatus further comprises exception handling circuitry for handling exceptions, said exception handling circuitry comprising:

an exception data store for storing register mappings for registers written to by operations that are speculatively executed and restore mappings for said registers, such that if an exception occurs said mapping can be restored to a previous state;

said register renaming circuitry being responsive to receipt of an exception to update said renaming table using said restore mappings from said exception data store, and to determine whether at least one of said additional registers has been mapped by said speculatively executed operations that are aborted and if so to generate a signal to indicate to said register release circuitry that said second set of conditions is met when said at least one counter associated with said at least one of said additional registers has counted to said predetermined value.

Where data processing apparatus execute instructions speculatively then where the speculatively executed instructions need to be aborted as they should not be executed, exception handling circuitry is required to be able to rewind back to the place where the speculation started. Where register renaming occurs then the previous mappings at this point need to be stored and the physical registers of these mappings should not have been freed as the values stored in them will be needed. Thus, the exception handling circuitry comprises an exception data store that stores information regarding the mappings for each register that are specified by each speculatively executed operation and a previous restore mapping indicating a previous mapping for that register. Where an exception occurs and the speculatively executed operations are aborted, the renaming table is updated with the restore mappings and where any of the additional registers have been mapped by the speculatively executed operations a signal is generated to indicate to said register release circuitry that the second set of conditions is met when the counter has counted to the predetermined value. In this regard, where an exception occurs it may be that the register is never written and therefore, if no signal is sent from the exception circuitry the second set of conditions might never be met and the register never released. Thus, some record of the renaming of the additional register is kept within the exception circuitry and this is used to override the writing to the register condition, perhaps by masking the invalid indicator or forcing it to valid.
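
The recovery sequence described above can be sketched as follows. This is an illustration under several assumptions rather than the circuitry itself: each entry of the exception data store is modelled as a hypothetical tuple `(register, new_phys, restore_phys)`, entries are undone youngest first, the write condition is overridden by forcing the valid indicator to valid, and `is_additional` is a hypothetical predicate identifying the additional registers.

```python
from collections import deque


def handle_exception(exception_store, rename_table, valid, is_additional):
    """Undo speculative mappings and unblock release of additional registers.

    exception_store -- deque of (register, new_phys, restore_phys), oldest first
    rename_table    -- dict: register -> physical register (updated in place)
    valid           -- dict: physical register -> valid indicator (updated in place)
    Returns the physical registers whose write condition was overridden.
    """
    forced = []
    while exception_store:
        reg, new_phys, restore_phys = exception_store.pop()  # youngest first
        rename_table[reg] = restore_phys                      # restore old mapping
        if is_additional(reg):
            # The aborted write may never happen, so force the indicator to
            # valid; release then depends only on the counter condition.
            valid[new_phys] = True
            forced.append(new_phys)
    return forced
```

Physical registers that were mapped to architectural registers by the aborted operations are not forced to valid here, since their release is governed by the first set of conditions.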

In some embodiments said data processing apparatus further comprises exception handling circuitry configured in response to detecting an exception to force any pending writes to said at least one additional register to complete prior to flushing operations from said processor.

An alternative method of handling the exception is simply to force any pending writes to the additional registers to complete before flushing operations from the processor. This will set the valid bit to valid and will enable the second set of conditions to be met when the counter has reached the value.

In this regard, although the forcing of the writes will achieve correct operation, the use of the exception data store to store information about the additional register may be simpler as it reuses the hardware of the exception data store that is required for restoring the values in any case and does not require monitoring and control of the pipeline.

A third aspect of the present invention provides a method of renaming registers within a processing apparatus, said processing apparatus being configured to process a stream of instructions from an instruction set, said instructions specifying registers from an architectural set of registers, said processing apparatus comprising a physical set of registers configured to store data values being processed by said processing apparatus;

said method comprising:

receiving a stream of operations from an instruction decoder within said processing apparatus and mapping registers that are to be written to by said stream of operations to physical registers within said physical set of registers that are currently available;

identifying additional registers that are registers that are to be written to by said operations that are not from said architectural set of registers;

releasing said physical registers that have been mapped to said registers from said set of architectural registers such that they are available for remapping when a first set of conditions have been met, and

releasing said physical registers that have been mapped to said additional registers such that they are available for remapping, when a second set of conditions have been met, at least some of said conditions within said first set of conditions being different to said conditions within said second set of conditions.

A fourth aspect of the present invention provides register renaming means for renaming registers for a processing means, said processing means being for processing a stream of instructions from an instruction set, said instructions specifying registers from an architectural set of registers, said processing means comprising a physical set of registers configured to store data values being processed by said processing means;

said register renaming means being for receiving a stream of operations from an instruction decoding means within said processing means and for mapping registers that are to be written to by said stream of operations to physical registers within said physical set of registers that are currently available;

said register renaming means being for identifying additional registers that are registers that are to be written to by said operations that are not from said architectural set of registers;

said register renaming means comprising register release means for releasing physical registers that have been mapped such that they are available for register renaming;

said register release means being for releasing said physical registers that have been mapped to said registers from said set of architectural registers when a first set of conditions have been met, and for releasing said physical registers that have been mapped to said additional registers when a second set of conditions have been met, at least some of said conditions within said first set of conditions being different to said conditions within said second set of conditions.

The above, and other objects, features and advantages of this invention will be apparent from the following detailed description of illustrative embodiments which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 schematically shows a representation of the renaming table and physical registers;

FIG. 2 shows a data processing apparatus according to an embodiment of the present invention;

FIG. 3 shows register renaming circuitry according to an embodiment of the present invention;

FIG. 4 shows an exception FIFO used in an embodiment of the present invention;

FIG. 5 shows a portion of the register renaming circuitry according to an embodiment of the present invention;

FIG. 6 shows an example of an instruction that is split into micro-operations and the counter values that are set in response to receiving these in the register renaming circuitry;

FIG. 7 shows a further example of instructions that are split into micro-operations;

FIG. 8 shows a further example of an instruction stream including an instruction that is split into micro-operations;

FIG. 9 shows a flow diagram illustrating steps in a method according to an embodiment of the present invention; and

FIG. 10 shows a flow diagram illustrating what happens when an exception is received.

DESCRIPTION OF EXAMPLE EMBODIMENTS

FIG. 1 schematically shows a renaming table that shows the set of architectural registers that can be named by a programmer's instructions, and an additional two registers that cannot be named by a programmer but can be used by operations executed by a processor, for example operations generated by a decoder decoding certain instructions. Thus, each of these registers when used will need to be mapped to a physical register in the physical register bank 10.

In this example there are 32 architectural registers and thus, five bits are required to identify them. There are an additional two registers and an additional bit is needed to identify these. Thus, in this example, there are six bits that identify the registers specified by the operations, one of the bits indicating whether they are additional registers or not.
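
The six-bit identifier can be illustrated with a short encoding sketch. The text above states only that one of the six bits distinguishes the additional registers; the choice of the most significant bit for this purpose, and the names `ADDITIONAL_FLAG`, `encode` and `is_additional`, are assumptions made for illustration.

```python
NUM_ARCH = 32               # architectural registers: five index bits
NUM_ADDITIONAL = 2          # additional registers in this example
ADDITIONAL_FLAG = 0b100000  # assumption: the sixth (top) bit marks them


def encode(index, additional=False):
    """Build the six-bit identifier carried by an operation."""
    limit = NUM_ADDITIONAL if additional else NUM_ARCH
    if not 0 <= index < limit:
        raise ValueError("register index out of range")
    return (ADDITIONAL_FLAG | index) if additional else index


def is_additional(reg_id):
    """The renaming circuitry only needs to test one bit to pick the condition set."""
    return bool(reg_id & ADDITIONAL_FLAG)
```

Under this assumed encoding, identifiers 0 to 31 name the architectural registers and identifiers 32 and 33 name the two additional registers.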

FIG. 1 also shows the physical registers that are present in physical register bank 10. In this example they are double data registers such that each register can store 128 bits; there is also an additional valid bit associated with each register. The first 16 registers can be accessed as two sets of single data registers of 64 bits, while the remaining registers are accessed simply as double data registers and if single data is being used the second half of them is not used. There are more physical registers than there are architectural and additional registers and these are provided to enable a processor to process instructions speculatively and out of order. The architectural and additional registers are mapped to actual physical registers and the mappings are stored so that the data that they write can be read by other operations by accessing these registers.

FIG. 2 shows a data processing apparatus 20 according to an embodiment of the present invention. The data processing apparatus comprises a main processor core 22 and a data engine 24. The processor core has a fetch unit 30 that fetches instructions from an instruction cache (not shown). In this example it is a dual stream processing pipeline and therefore two instructions are fetched in parallel. These instructions are then sent to decode unit 32 where an initial decode is performed. The decode unit identifies instructions that can be efficiently processed by the data engine and sends them to the data engine instruction decoder 44. The other instructions that are not suitable for the data engine are sent on to the renaming circuitry 34 in the processor core pipeline.

The decoding section 44 of the data engine generates and transmits operations to a register renaming section 46 and these are then sent on to a dispatch/issue section 48 and finally to an execution unit 49. The core similarly has a dispatch/issue unit 38 and an execution unit 39.

The instruction decoder 44 of the data engine 24 will determine from the instructions it receives which ones should be split into micro-operations. In this regard, the data engine is designed such that it can process some instructions more efficiently if it divides them into several micro-operations. These micro-operations may need to transfer data between themselves and they use the additional registers of FIG. 1 to do this. These are often called temporary registers as they are only used within an instruction. Thus, decode circuitry 44 will determine which of the partially decoded instructions it receives from decoder 32 should be split into micro-operations and will perform this task. These operations will then be sent on to rename circuitry 46 along with operations generated by the instruction decoder 44 from instructions that are not split.

Rename circuitry 34 and 46 both act to map registers from the architectural register set to registers from the physical register set and, in the case of rename circuitry 46, also from the additional registers to the physical registers. Each renaming circuitry 34, 46 maps to physical registers from within their respective physical register banks 37 and 47. These mappings are stored within rename tables in the renaming circuitry 34, 46 and allow later operations that require access to the data written to these registers to locate the actual physical registers where this data is held. Once the renaming circuitry 34, 46 has finished renaming the registers specified by the operation it will send it on to the dispatch/issue unit 38, 48 where it will then be sent forward to the respective execution units 39, 49. It should be noted that once the operations reach the renaming circuitry 34, 46 in either pipeline they will stay in order and will be executed in the order in which they are received. This means that, for the instructions that are split into micro-operations, one can be sure that once a first micro-operation is received no micro-operations from other instructions will be received until the final micro-operation pertaining to that split instruction has passed through the renaming circuitry 46, unless an exception occurs. This feature means that the conditions that are required to be met for the freeing of these registers for use in future mappings are simpler than those for other operations that are generated directly from instructions.

In this regard, although the mapping of registers by the register renaming circuitry 34, 46 allows for the use of more physical registers than there are architectural registers and allows some speculative and out of order execution, it is important that the register renaming circuitry knows when a physical register that has been mapped becomes available for mapping again. In this regard, clearly the sooner they are made available or are freed the fewer actual physical registers will be required by the system.

Where the registers are architectural registers specified by instructions then these instructions may execute out of order and it is complicated to determine when it is safe to free a register. The additional registers are used by micro-operations that are generated within the decode circuitry and thus will always execute in order, and determining when they can be freed is simpler. In this regard the determination of when writing to an additional register has occurred and the determination of when the micro-operations within a split instruction have all been received at the register renaming circuitry may be sufficient to determine when the additional register can be freed.

Further constraints on the freeing of architectural registers occur in processors where speculative execution of instructions happens. In such cases there is a need to be able to restore the processor to the state it had before the speculative execution when an exception occurs and the speculative execution should not have happened.

To address this the data processing apparatus 20 has exception circuitry which includes exception handling unit 50, which has an exception FIFO 52 for storing information regarding the mappings of speculatively executed operations executed by execute unit 39, and exception circuitry 51, which has an exception FIFO 53 for storing information relating to mappings for speculatively executed operations that are executed by execute unit 49 within data engine 24. The exception handling unit 50 forwards feedback information about speculation to the data engine's exception circuitry 51. The exception handling unit 50 also comprises the program counter such that it controls the restart of program execution following a misspeculation.

When registers are renamed in a data processing apparatus that speculatively executes instructions then if an exception occurs and the speculatively executed instructions need to be aborted the processor will need to rewind back to a point before the speculation started. Thus, the state of the processor at that point needs to be stored. Where registers have been renamed the mappings that were current at that point need to be available and in addition to this the actual physical registers used in these mappings should not have been remapped as the data within these registers is required. Thus, exception FIFOs 52 and 53 retain data relating to the mappings of speculatively executed instructions and restore mappings for the registers that the speculatively executed instructions remap. The registers that are present as restore mappings should not be freed as if an exception occurs they will be required.

The exception FIFO 53 also contains information on any additional or temporary registers that are currently mapped by the renaming circuitry. If an exception occurs then the instructions up to the first speculatively executed instruction are completed and then the renaming table within the register renaming circuitry is updated with restore register mappings from the exception FIFO. Furthermore, the register renaming circuitry notes if there is an additional register present in the exception FIFO and if there is then it masks the valid bit for that physical register such that the renaming circuitry 46 believes that the physical register has been written. This means that this additional register can be freed when the other condition of the conditions set for the additional registers has been met, the other condition being that a certain number of micro-operations following the micro-operation that writes to this register have passed through the register renaming circuitry or a corresponding number of cycles have passed.

FIG. 3 shows register renaming circuitry 46 of FIG. 2 in more detail. There is renaming control circuitry 62 which receives the operations from decode and determines from this when a register is to be written to. If there is a register that is to be written then it will map that register to a physical register that it determines is available from the available register table 68 and it will update the renaming table 64. If the register being mapped is already present in the renaming table then it will overwrite the current mapping. If the register mapping that is overwritten is for a temporary or additional register then it will set the counter relating to that register to a predetermined value and it will send a control signal to the register release circuitry indicating that the counter related to that mapping has reached the predetermined value.

When a register that is mapped is an additional register then the counter is set to the predetermined value and starts counting clock cycles. The register renaming circuitry expects to receive an operation at each cycle and thus, where this occurs, the counter counting cycles is equivalent to it counting operations, which is what occurs in normal operation. If the pipeline stalls for some reason such as for an exception then operations are no longer received every cycle. However, in such a case the instructions from which the micro-operations were formed will not complete and thus the fact that the counter indicates that the micro-operations have all been received does not matter, as any micro-operation to read the register that might not yet have been received will not be executed.

Counters 65 and 66 relate to the two additional registers and each send signals to register release circuitry 67 which determines when registers that have previously been mapped can be released and stored in available register table 68 to be reused in mappings.

The conditions for freeing the additional registers are that the registers have been written to, which is indicated by receipt of a valid signal from the register bank for that register, and that the counter for that register has reached the predetermined value, indicating that all of the micro-operations that might read from that register have passed through the renaming circuitry. When these two conditions are met the register release circuitry generates a signal indicating that this physical register is available for renaming and the available table 68 is updated.
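The two conditions for freeing an additional register can be modelled in software as below. This is a minimal sketch rather than a description of the circuitry: the class AdditionalMapping, the function try_release and the use of a Python set for the available table are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdditionalMapping:
    """Hypothetical record for one additional (temporary) register mapping."""
    physical: int               # physical register the temporary is mapped to
    counter_done: bool = False  # counter has reached the predetermined value
    valid: bool = False         # valid signal received from the register bank


def try_release(mapping, available):
    """Mark the physical register available only when both conditions hold."""
    if mapping.counter_done and mapping.valid:
        available.add(mapping.physical)
        return True
    return False
```

Neither condition alone is sufficient in this model: a register whose counter has expired but which has not yet been written stays mapped, and vice versa.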

With regard to the counters reaching the predetermined value, this value may be set to the number of micro-operations that an instruction is split into. However, in order to simplify the circuitry it may be convenient to use a single predetermined value for all split instructions. This value could be set to be the maximum number of micro-operations that could occur between a micro-operation that writes to an additional register and a micro-operation that reads from it for all instructions that are split into micro-operations. Thus, the counter is set on receipt of a micro-operation that writes to an additional register and when it reaches this value one can be sure that all of the micro-operations that might want to read from the register have been received at the renaming circuitry, whichever instruction was the source of the micro-operations. The micro-operations may all have been received sooner for some instructions that are split into fewer micro-operations. However, as the counter is updated each cycle, the additional number of cycles required to count to the higher value will not cause much delay and may not cause any delay, as the additional constraint of the register having been written to will often be met later than the counter reaching the value. Where a micro-operation is received that writes to an additional register while the counter for that register is still counting then the counter is reset and a signal is sent to indicate that the counter has reached the required value, as the receipt of a micro-operation from a different instruction is an indication that the micro-operations generated from the previously split instruction must have completed, since they are received in order at the renaming circuitry.
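One way to picture how a single predetermined value could be derived is sketched below. The representation of a split instruction as a list of (writes, reads) pairs, the function names, and the example sequences LONG and SHORT are all hypothetical. The sketch also assumes the distance is counted as the number of micro-operations received after the writer, up to and including the last reader.

```python
def write_to_read_distance(micro_ops):
    """Longest distance from a write of a temporary to a later read of it.

    micro_ops is an ordered list of (writes, reads) pairs, each a set of
    temporary register names used by one micro-operation.
    """
    last_write = {}
    longest = 0
    for position, (writes, reads) in enumerate(micro_ops):
        for reg in reads:
            if reg in last_write:
                longest = max(longest, position - last_write[reg])
        for reg in writes:
            last_write[reg] = position
    return longest


def predetermined_value(split_instructions):
    """Single counter value covering every split instruction."""
    return max(write_to_read_distance(ops) for ops in split_instructions.values())


# Hypothetical micro-operation sequences, not taken from any real decoder.
EXAMPLE = {
    "LONG": [({"temp0"}, set()), ({"temp1"}, set()), (set(), set()),
             (set(), {"temp0"}), (set(), {"temp1"})],
    "SHORT": [({"temp0"}, set()), (set(), {"temp0"})],
}
```

Under these assumptions predetermined_value(EXAMPLE) gives 3, so the shorter instruction simply waits a few extra counts before its temporary is eligible for release.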

Architectural registers that are remapped have more restrictive conditions to be met before they can be released and thus the register release circuitry 67 needs to perform the following tasks: monitor the exception FIFO to determine if the physical register is a restore register within the exception FIFO; check the renaming table to confirm that the mapping is not present in the renaming table; check that the register has been written to and thus a valid indication has come from the register bank; and monitor “snapshot” circuitry 69, which is circuitry whose operation is triggered by the renaming control circuitry 62 detecting a store instruction. A store instruction will store a value of a register to memory. When accessing the memory the instruction may stall if the physical storage location to be accessed is not in the local virtual to physical address translation tables or if there are access permission problems. Thus, the writing of the value to the memory may take a long time and the physical register holding the data must not be remapped during this time. Thus, this snapshot circuitry is used to take note of store instructions and the condition of the core when they are received and to monitor them as they move through the processing apparatus. This is an additional constraint that must be met by physical registers mapped to architectural registers that are to be freed by register release circuitry 67.
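The four checks performed for architectural registers can be summarised as a single predicate. The following is a sketch under assumptions: the function name and its parameters are invented for illustration, and plain dictionaries and sets stand in for the register bank valid bits, the renaming table, the exception FIFO and the snapshot circuitry.

```python
def can_release_architectural(physical, *, valid_bits, renaming_table,
                              restore_registers, snapshot_registers):
    """Return True when every condition in the first set is met.

    valid_bits:         physical register -> True once it has been written
    renaming_table:     architectural register -> current physical register
    restore_registers:  physical registers held as restore mappings
    snapshot_registers: physical registers read by stores still in flight
    """
    return (valid_bits.get(physical, False)
            and physical not in renaming_table.values()
            and physical not in restore_registers
            and physical not in snapshot_registers)
```

A register that fails any one of the checks, for example one still recorded as the source of an uncompleted store, stays unavailable in this model.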

An example of exception FIFO 53 of FIG. 2 is shown in FIG. 4. In this example there are instructions that lie between two branch instructions shown. Branch instructions are speculatively executed instructions and are therefore instructions that may generate an exception. If an exception is generated by branch 1 Br1 then all of the instructions up to branch 1 must be completed and as they complete the registers that are associated with them are removed from the exception FIFO 53. The information remaining within the exception FIFO 53 is then used to restore the renaming table when the processing circuit is rewound to this point. Thus, the recovery mappings shown in table 53 are used to update the renaming table. In this case, as there was a VZIP instruction which is split into micro-operations and uses temporary registers temp0 and temp1, the exception circuitry masks the valid bit associated with physical register 35 and physical register 38 that the two temp registers are mapped to, such that the register control circuitry believes that it has written to these registers and this condition is met. The register release circuitry will therefore be able to free these registers once the counters related to the registers have reached the required value. If the counters have already reached their values then they will be freed immediately.

FIG. 5 shows the masking of this valid indicator in more detail. Thus, there is in this Figure renaming control circuitry 62 which receives decoded operations and is clocked by a clock CLK1, there is a counter 65 for temp register temp0 and a counter 66 for a temp register temp1. Register release circuitry 67 monitors the values in these counters and also monitors the valid bit of the physical register that temp0 and temp1 have been mapped to. When the counter related to a temporary register reaches its predetermined value and the valid bit for that register in register bank 10 is set to valid then the register release circuitry 67 will release the physical register that temporary register was mapped to. In the case that an exception occurs and one of the temporary registers was in the exception FIFO 53 then control circuitry 62 will generate a mask signal which will act with mask circuitry 69 to mask the value of the valid bit received for the physical register that the temporary register was mapped to and thus the register release circuitry sees it as being set, this portion of the condition is met, and the register can be released when the counter reaches the required value.

FIG. 6 shows an example of an instruction that is split into micro-operations. This instruction is a VLD4 instruction and moves data that is stored in memory into a register and reorders it at the same time, the new order being more suitable for the processing operations. In order to do this it requires the use of temporary registers temp0 and temp1 to store the data received from the memory while it is being reordered and prior to writing it to the destination registers. As can be seen the counter is set for each of the temporary registers when they are written to and in this example the maximum number of micro-operations between a micro-operation writing to a register and a later micro-operation reading from that register is 3 and thus the counters are both set to 3. When each subsequent micro-operation is received the counter is decremented. When the counter reaches 0 and when an indication has been received that the temporary register has been written to, the physical register that the temporary register was mapped to can be released for remapping.
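The countdown described for FIG. 6 can be modelled with a small counter object. This is an illustrative sketch only: the class name ReleaseCounter and its methods arm and tick are assumptions, and the start value of 3 is the example value given in the text.

```python
class ReleaseCounter:
    """Hypothetical down-counter for one temporary register."""

    def __init__(self, start=3):
        self.start = start
        self.value = None  # None means no write has armed the counter yet

    def arm(self):
        """Called when a micro-operation writes the temporary register."""
        self.value = self.start

    def tick(self):
        """Called for each subsequent micro-operation (or clock cycle)."""
        if self.value:  # skips both the unarmed state and an expired counter
            self.value -= 1

    @property
    def done(self):
        return self.value == 0
```

After arm() and three calls to tick() the done property becomes true, at which point only the written-to condition remains before the physical register can be released.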

FIG. 7 shows a different example where data is moved from the memory and reordered. In this case, there are two micro-operations that are formed from the instruction and that use the temporary registers. However, although there are only two micro-operations formed, the counter is still set to 3 and thus, when the micro-operations for the split instruction VLD2 have completed, the counters have not been decremented to 0 and further operations received by the register renaming circuitry will cause the counter to be decremented further and the counter will in time reach 0 and a signal can be sent to the register release circuitry. It should be noted that although in these examples the counter is set to the predetermined value and decrements to 0, in other embodiments it could be set to 0 and be incremented to the predetermined value and a comparator could be used to determine when it reaches this value.

FIG. 8 shows an alternative embodiment where two VZIP instructions that use quad registers Q₀ and Q₁ are split into two micro-operations shown schematically as VOP1 and VOP2 which occur one after the other in the program stream. In such a case the temporary registers need to be written to by the second instruction before the counters for those registers have reached 0. The register renaming circuitry will detect the presence of the temporary registers in the renaming table and in response to this will overwrite them and will also signal to the register release circuitry that the counters have reached a predetermined value. The counters will be reset for the next instruction. In this case, because the counters are set to the maximum value of micro-operations for a split instruction, which in this example is three, and because this instruction is only split into two micro-operations, the counter does not indicate that it is safe to release the physical registers. However, the register renaming circuitry knows that it must be, as it has received a write operation to the same registers and, as the register renaming circuitry receives the operations in order, it knows that it would not receive this operation if the other micro-operations had not completed. Thus, it can send a signal to the register release circuitry that any read operations that might have required these registers have completed, which is in effect what the counter signal indicates.
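The overwrite case of FIG. 8 can be sketched as follows. The class TempRenamer, its fields and the forced_done list are hypothetical names standing in for the renaming control circuitry, the renaming table, the counters and the control signal sent to the register release circuitry.

```python
class TempRenamer:
    """Hypothetical model of renaming writes to temporary registers."""

    def __init__(self, free_registers, start=3):
        self.free = list(free_registers)  # stands in for the available table
        self.table = {}        # temporary name -> physical register
        self.counters = {}     # temporary name -> remaining count
        self.forced_done = []  # physical registers signalled as "count reached"
        self.start = start

    def write(self, temp):
        """Map a temporary that is about to be written by a micro-operation."""
        previous = self.table.get(temp)
        if previous is not None:
            # In-order receipt means every reader of the old mapping has
            # already passed, so signal completion regardless of the count.
            self.forced_done.append(previous)
        physical = self.free.pop(0)
        self.table[temp] = physical
        self.counters[temp] = self.start  # counter restarts for the new mapping
        return physical
```

In this model a second write to temp0 immediately reports the first mapping as finished counting, while the freshly mapped register starts its own countdown.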

FIG. 9 shows a flow diagram illustrating steps in the method for releasing registers that have been renamed according to an embodiment of the present invention. Thus, a decoded operation is received and it is determined if the operation writes to a register. If it does write to a register then it is determined if the register that is to be written to is already mapped in the renaming table. If it is then the previous mapping is updated with a new mapping and the valid bit of the physical register that has been mapped to is set to invalid. It is then determined if the register is from the architectural set. If it is not then it is one of the additional registers and at this point the counter is reset for that register and an indication that the counter has reached the required value, 0 in this case, for the physical register of the previous mapping is sent to the register release circuitry.

If the register written to was not already mapped in the renaming table then the renaming table is updated with the new mapping and the valid bit for the physical register is set to invalid and it is then determined if this register is from the architectural set. If it is not, meaning it is one of the additional registers, then the counter for the register is set to the predetermined value. This counter is then decremented in response to the clock signal.

If it is determined during these operations that the register is from the architectural set then the counter is not used and rather it is determined whether the first set of conditions that are used to free architectural registers are met. These include that the register has been written to, that the register is not in the renaming table, that it is not a restore register and that it is not in the “snapshot”.

For the additional register mappings it is determined if the counter reaches 0. When it does then it is determined if the valid bit associated with the mapped physical register has been set. If it has then the conditions for the additional register have been met and the physical register is marked as available in the available table. As noted previously there are situations in which the additional registers can be released before the counter has reached the predetermined value or before the register is written; in such a case signals are sent to the register release circuitry indicating that these conditions have been met even where they have not. When the first conditions are met for architectural registers then the physical register is marked as available in the available table.

FIG. 10 shows what happens when an exception is received. When the exception is received the instructions up to the first speculative instruction that generated the exception are completed and then the renaming table is updated with the restore register values from the exception FIFO. It is then determined if there is a temporary register within the exception FIFO. If there is then the control signal from the register bank for the physical register that this temporary register is mapped to is masked to appear valid such that this condition from the second set of conditions is met.
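The exception flow of FIG. 10 can be approximated by the sketch below. The layout of the exception FIFO entries (dictionaries with the keys register, restore_physical, physical and is_temporary) is an assumption, as are the function names; the text does not specify how the FIFO stores its entries.

```python
def handle_exception(renaming_table, exception_fifo, valid_mask):
    """Restore mappings and mask the valid bit of any temporary register.

    renaming_table: register name -> physical register (updated in place)
    exception_fifo: list of hypothetical entry dictionaries
    valid_mask:     set of physical registers forced to appear written
    """
    for entry in exception_fifo:
        if entry["is_temporary"]:
            valid_mask.add(entry["physical"])
        else:
            renaming_table[entry["register"]] = entry["restore_physical"]


def appears_valid(physical, valid_bits, valid_mask):
    """Valid indication as seen by the register release circuitry."""
    return valid_bits.get(physical, False) or physical in valid_mask
```

Once masked, the temporary register satisfies the written-to condition in this model even though the aborted write never happened, leaving only the counter condition outstanding.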

Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope and spirit of the invention as defined by the appended claims.

We claim:
 1. Register renaming apparatus for a processing system having a physical set of registers to store data values being processed and in which a stream of instructions from an instruction set is processed, said instructions specifying registers from an architectural set of registers, said register renaming apparatus comprising: first circuitry configured to receive a stream of operations from an instruction decoder within said processing apparatus and to map registers that are to be written to by said stream of operations to physical registers within said physical set of registers that are currently available and to identify additional registers that are registers that are to be written to by said operations that are not from said architectural set of registers; and second circuitry configured to release physical registers that have been mapped such that they are available for register renaming and to release said physical registers that have been mapped to said registers from said set of architectural registers when a first set of conditions have been met, and to release said physical registers that have been mapped to said additional registers when a second set of conditions have been met, at least some of said conditions within said first set of conditions being different to said conditions within said second set of conditions.
 2. Register renaming apparatus according to claim 1, wherein said stream of operations comprise at least some micro-operations, wherein said micro-operations are generated by said instruction decoder by splitting at least one of said instructions from said stream of instructions into a plurality of said micro-operations, at least one of said micro-operations specifying at least one of said additional registers.
 3. Register renaming apparatus according to claim 2, further comprising at least one counter associated with said at least one additional register, said first circuitry being configured in response to receiving a micro-operation that writes to said at least one additional register to commence counting, said second set of conditions including said counter associated with said additional register having counted to a predetermined value.
 4. Register renaming apparatus according to claim 3, wherein said counter counts a number of clock cycles.
 5. Register renaming apparatus according to claim 3, wherein said counter counts a number of operations received at said first circuitry.
 6. Register renaming apparatus according to claim 3, wherein said predetermined value comprises a maximum number for all of said instructions that are split into micro-operations by said instruction decoder, of micro-operations between a micro-operation that is writing to said additional register and a micro-operation that reads said additional register.
 7. Register renaming apparatus according to claim 3, wherein said second set of conditions further comprises that said mapped register has been written to, said second set of conditions being met when said counter associated with said additional register has counted to said predetermined value and said physical register that said additional register was mapped to has been written to.
 8. Register renaming apparatus according to claim 7, wherein said physical registers comprises a valid indicator associated with them, said first circuitry being configured to set said valid indicator to invalid on mapping a physical register, said valid indicator being set to valid when said register is written.
 9. Register renaming apparatus according to claim 3, wherein said first circuitry is further configured to determine when an operation is received that writes to one of said additional registers that is currently mapped and stored in a renaming buffer, said first circuitry being configured to generate a signal to indicate to said second circuitry that said counter associated with said one of said additional registers has reached said predetermined value irrespective of a value of said counter and to reset said counter.
 10. Register renaming apparatus according to claim 1, further comprising: a renaming buffer for storing a plurality of mappings of said registers specified by said operations received from said instruction decoder at said register renaming circuitry to said physical registers, said first circuitry being configured to add a new mapping to said renaming buffer on receipt of an operation specifying a register and on receipt of an operation specifying a register that is already mapped in said renaming buffer to remove said mapping from said renaming buffer and to update said renaming buffer with a new mapping; and said first set of conditions comprises that: said physical register has been written to; said physical register is not present in said renaming buffer; and said physical register is not stored in said data processing apparatus as a restore register for restoring a set of mapping that was valid at a point prior to execution of speculative instructions.
 11. Register renaming apparatus according to claim 10, wherein said first set of conditions further comprises that said physical register is not mapped to a source register of a decoded store instruction that has not completed.
 12. A data processing system comprising: a register bank comprising a set of physical registers; at least one instruction decoder for decoding a stream of instructions; register renaming apparatus according to claim 1; and at least one data processor for processing said stream of instructions.
 13. A data processing system according to claim 12, wherein said at least one data processor comprises a data engine; and said at least one instruction decoder is configured to split at least one predetermined instruction from said stream of instructions into a plurality of micro-operations and to send said plurality of micro-operations to be processed by said data engine.
 14. A data processing system according to claim 12, wherein said register bank comprises a valid bit associated with each of said registers, said first circuitry being configured when mapping a physical register to update said valid bit associated with said physical register to invalid and said data processing apparatus being configured to set said valid bit to valid on writing to said physical register.
 15. A data processing system according to claim 12, further comprising exception handling circuitry for handling exceptions, said exception handling circuitry comprising: an exception data store for storing register mappings for registers written to by operations that are speculatively executed and restore mappings for said registers, such that if an exception occurs said mapping can be restored to a previous state; said first circuitry being responsive to receipt of an exception to update said renaming table using said restore mappings from said exception data store, and to determine whether at least one of said additional registers has been mapped by said speculatively executed operations that are aborted and if so to generate a signal to indicate to said second circuitry that said second set of conditions is met when said at least one counter associated with said at least one of said additional registers has counted to said predetermined value.
 16. A data processing system according to claim 12, further comprising an exception handling circuitry configured in response to detecting an exception to force any pending writes to said at least one additional register to complete prior to flushing operations from said processor.
 17. A method of renaming registers within a processing apparatus having a physical set of registers to store data values being processed and in which a stream of instructions from an instruction set is processed, said instructions specifying registers from an architectural set of registers, said method comprising: receiving a stream of operations from an instruction decoder within said processing apparatus and mapping registers that are to be written to by said stream of operations to physical registers within said physical set of registers that are currently available; identifying additional registers that are registers that are to be written to by said operations that are not from said architectural set of registers; releasing said physical registers that have been mapped to said registers from said set of architectural registers such that they are available for remapping when a first set of conditions have been met, and releasing said physical registers that have been mapped to said additional registers such that they are available for remapping, when a second set of conditions have been met, at least some of said conditions within said first set of conditions being different to said conditions within said second set of conditions.
 18. A method according to claim 17, wherein said stream of operations comprise at least some micro-operations, said method comprising an initial step of splitting at least one of said instructions from said stream of instructions into a plurality of said micro-operations, at least one of said micro-operations specifying at least one of said additional registers.
 19. A method according to claim 18, wherein said method comprises a step of in response to receiving a micro-operation that writes to said at least one additional register commencing counting, said second set of conditions for said at least one additional register including counting to a predetermined value.
 20. Apparatus for renaming registers for a processing apparatus having a physical set of registers to store data values being processed and in which a stream of instructions from an instruction set is processed, said instructions specifying registers from an architectural set of registers, said apparatus comprising: means for receiving a stream of operations from an instruction decoder within said processing apparatus and for mapping registers that are to be written to by said stream of operations to physical registers within said physical set of registers that are currently available; means for identifying additional registers that are registers that are to be written to by said operations that are not from said architectural set of registers; means for releasing said physical registers that have been mapped to said registers from said set of architectural registers when a first set of conditions have been met; and means for releasing said physical registers that have been mapped to said additional registers when a second set of conditions have been met, at least some of said conditions within said first set of conditions being different to said conditions within said second set of conditions, wherein released physical registers are available for register renaming.